Creating a New Text Classifier

To create a new text classifier (model), you need a dataset. Creating a new text classifier therefore involves two steps:

Preparing Datasets
Creating/Training a Classifier Model

Preparing Datasets

Navigate to Smart Bot > Classification > Text Classification.

The Text Classification page displays two tabs: Dataset and Model.

Open the Dataset page.
Click Add Dataset.

Enter a unique Dataset Name. The name must start with a letter and can contain only letters, numbers, spaces, and underscores.
Optionally, enter an additional Description.
In the Upload File option, browse to and select the dataset file. The file must be in ".csv" (Comma Separated Values) or ".xlsx" (Microsoft Excel Open XML Spreadsheet) format and must contain two columns, Text and Class.
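
The naming rule for Dataset Name can be checked before you fill in the form. The following is a minimal sketch, assuming that "letters" means ASCII letters; the function name is hypothetical and is not part of Smart Bot.

```python
import re

# Assumed encoding of the documented rule: the first character is a letter,
# and the rest are letters, digits, spaces, or underscores.
DATASET_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9 _]*")

def is_valid_dataset_name(name: str) -> bool:
    """Return True if `name` satisfies the naming rule described above."""
    return DATASET_NAME_RE.fullmatch(name) is not None

print(is_valid_dataset_name("Sentiment data_v2"))  # True
print(is_valid_dataset_name("2024_reviews"))       # False (starts with a digit)
```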

The first column, "Text", should contain the narrative text obtained from sources such as emails, historical records, databases, applications, or blogs. The second column, "Class", should define the corresponding class of each Text entry. The following image shows a sample dataset used to train the Sentiment Classifier:

To prepare datasets, refer to the section Guidelines to prepare datasets for NLP.
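
Before uploading, it can help to confirm that a ".csv" file really has the two expected columns. The sketch below assumes that the file starts with a header row reading Text,Class, which this page does not state explicitly. The function name and the sample rows are invented for illustration.

```python
import csv
import io

REQUIRED_COLUMNS = ["Text", "Class"]

def check_dataset_csv(stream):
    """Return a list of problems found in a Text/Class CSV; empty if none."""
    problems = []
    reader = csv.reader(stream)
    header = next(reader, None)
    if header is None:
        return ["file is empty"]
    # Assumption: the first row is a header naming the two columns.
    if [h.strip() for h in header] != REQUIRED_COLUMNS:
        problems.append(f"expected header {REQUIRED_COLUMNS}, got {header}")
    for row_number, row in enumerate(reader, start=2):
        if len(row) != 2:
            problems.append(f"row {row_number}: expected 2 columns, got {len(row)}")
        elif not row[0].strip() or not row[1].strip():
            problems.append(f"row {row_number}: empty Text or Class")
    return problems

# Made-up sample content; text that contains a comma must be quoted.
sample = io.StringIO(
    "Text,Class\n"
    '"The delivery was quick, and support was helpful.",Positive\n'
    "The app crashes every time I log in.,Negative\n"
)
print(check_dataset_csv(sample))  # []
```

For a file on disk, pass an open file object instead, for example open(path, newline="", encoding="utf-8").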

Creating/Training a Classifier Model

Smart Bot provides a simplified way to train custom text classifiers without requiring extensive data science knowledge and expertise. For example, you can train a custom model from a dataset containing sample text classified into different categories.

The Model page lets you select an uploaded dataset and train a model from it.
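
To make concrete what training on Text/Class pairs means, the following sketch implements a very small Naive Bayes classifier, one of the methods offered on the Model page. It only illustrates the general technique. It is not Smart Bot's implementation, and the function names and sample rows are invented.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows):
    """Count rows per class and words per class from (text, class) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in rows:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary

def predict(model, text):
    """Return the class with the highest log-probability for `text`."""
    class_counts, word_counts, vocabulary = model
    total_rows = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total_rows)  # class prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero out a class.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up rows in the same shape as the dataset: (Text, Class).
rows = [
    ("great product fast delivery", "Positive"),
    ("very helpful support team", "Positive"),
    ("app crashes on login", "Negative"),
    ("slow delivery and rude support", "Negative"),
]
model = train_naive_bayes(rows)
print(predict(model, "great support"))  # Positive
```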

Creating Text Classifier Models

Navigate to Smart Bot > Classification > Text Classification.

The Text Classification page displays two tabs: Dataset and Model.

Open the Model page.
Click Add Model.

The Classifier Configuration window is displayed.

Select Create.
Enter a unique Classifier Name. The name cannot start with a number or contain spaces, and it can contain only letters (a-z), digits (0-9), and underscores (_).
Select the required dataset from the Dataset drop-down.
Select a Method from the drop-down. The supported methods are:
- Support Vector Machine (SVM)
- Multilayer Perceptron (MLP)
- Naive Bayes (NB)
- Random Forest (RF)

This selection is optional and is intended for advanced users. The default method is SVM.

Optionally, enter an additional Description.
Click Submit to start training the newly created model.
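
The fields in the steps above can be summarized as a small validated record. This is a sketch only: ClassifierConfig is a hypothetical class that mirrors the form and is not a Smart Bot API. It assumes that upper-case and lower-case letters are both accepted, and that a leading underscore is allowed because the rule forbids only a leading number.

```python
import re
from dataclasses import dataclass

# Abbreviations of the methods listed in the Method drop-down.
SUPPORTED_METHODS = {"SVM", "MLP", "NB", "RF"}

# Assumption: first character is a letter or underscore; the rest are
# letters, digits, or underscores. Spaces are never allowed.
CLASSIFIER_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

@dataclass
class ClassifierConfig:
    """Hypothetical mirror of the Classifier Configuration form fields."""
    name: str
    dataset: str
    method: str = "SVM"    # documented default
    description: str = ""  # optional

    def __post_init__(self):
        if not CLASSIFIER_NAME_RE.fullmatch(self.name):
            raise ValueError(f"invalid classifier name: {self.name!r}")
        if self.method not in SUPPORTED_METHODS:
            raise ValueError(f"unsupported method: {self.method!r}")

config = ClassifierConfig(name="sentiment_v1", dataset="Sentiment data")
print(config.method)  # SVM
```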

The model is added to the list with the status ‘In Progress’. Training takes some time, depending on the size of the dataset. When training is complete, Smart Bot updates the status to ‘Completed’.

Smart Bot also reports the quality of the model in terms of Accuracy, Precision, and Recall.

Accuracy - The ratio of correctly predicted observations to the total number of observations. It is the most intuitive performance measure.
Precision - The ratio of correctly predicted positive observations to all observations predicted as positive.
Recall - The ratio of correctly predicted positive observations to all observations that actually belong to the positive class.
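
The three definitions above can be worked through on a small example. The sketch below treats one class as "positive" and uses made-up labels. How Smart Bot aggregates these scores across several classes is not stated on this page, so the single-class view is an assumption.

```python
def evaluate(actual, predicted, positive):
    """Return (accuracy, precision, recall), treating `positive` as the class of interest."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)  # true positives
    fp = sum(1 for a, p in pairs if a != positive and p == positive)  # false positives
    fn = sum(1 for a, p in pairs if a == positive and p != positive)  # false negatives
    correct = sum(1 for a, p in pairs if a == p)
    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Made-up labels: 3 actual Positive rows and 5 actual Negative rows.
actual    = ["Positive"] * 3 + ["Negative"] * 5
predicted = ["Positive", "Positive", "Negative", "Positive",
             "Negative", "Negative", "Negative", "Negative"]
accuracy, precision, recall = evaluate(actual, predicted, positive="Positive")
print(accuracy, precision, recall)
```

Here 6 of 8 predictions are correct, so accuracy is 0.75. Two of the three rows predicted as Positive are correct, and two of the three actual Positive rows are found, so precision and recall are both 2/3.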

Precision and Recall are especially useful measures of prediction quality when the classes are very imbalanced. In information retrieval, for example, precision measures how relevant the returned results are, whereas recall measures how many of the truly relevant results are returned.
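
The effect of imbalance is easy to see with a degenerate case. In this sketch the class names and counts are hypothetical: a classifier that always predicts the majority class scores a high accuracy while finding none of the minority class.

```python
# Hypothetical imbalanced test set: 95 "Normal" items and 5 "Urgent" items.
actual = ["Normal"] * 95 + ["Urgent"] * 5
# A degenerate classifier that always predicts the majority class.
predicted = ["Normal"] * 100

correct = sum(a == p for a, p in zip(actual, predicted))
true_pos = sum(a == "Urgent" and p == "Urgent" for a, p in zip(actual, predicted))
actual_pos = actual.count("Urgent")
predicted_pos = predicted.count("Urgent")

accuracy = correct / len(actual)
recall = true_pos / actual_pos
# Precision is undefined when nothing is predicted positive; report 0.0 here.
precision = true_pos / predicted_pos if predicted_pos else 0.0

print(accuracy, precision, recall)  # 0.95 0.0 0.0
```

An accuracy of 0.95 looks strong, but a recall of 0.0 for "Urgent" shows that the model never identifies the class that matters most in this scenario.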